Abstract


The  problem  of  detecting  a  signal  known  except  for  amplitude  in  incompletely 
characterized  colored  non-Gaussian  noise  is  addressed.  The  problem  is  formulated 
as  a  testing  of  composite  hypotheses  using  parametric  models  for  the  statistical 
behavior  of  the  noise.  A  generalized  likelihood  ratio  test  is  employed.  It  is  shown 
that  for  a  symmetric  noise  probability  density  function  the  detection  performance 
is  asymptotically  equivalent  to  that  obtained  for  a  detector  designed  with  a  priori 
knowledge  of  the  noise  parameters.  Non-Gaussian  distributions  of  the  noise  are 
found  to  be  more  favorable  for  the  purpose  of  detection  as  compared  to  the  Gaussian 
distribution. 


I.  Introduction 

The theory of detection of a known signal in the presence of Gaussian noise having
a known covariance matrix is well developed [Van Trees 1968]. In many applications,
however, the covariance matrix is not known a priori. This difficulty can be alleviated
by characterizing the correlation pattern of the noise by a simple model and using
estimates of the model parameters to design a detector [Whalen 1971], [Bowyer et al
1979], [Kay 1983]. The difficulty increases when full information regarding the noise
probability density function (PDF), usually assumed to be Gaussian, is unavailable
due to insufficient knowledge about the noise source [Knight et al 1981]. There is no
uniformly most powerful (UMP) test in this case because the use of a Neyman-Pearson
criterion leads to a detector which depends on the unknown parameters. The Bayesian
method of assigning priors to the unknown parameters of the noise PDF produces an
'optimal' detector [Lee et al 1977], but requires a multidimensional integration. Its
performance is critically dependent on the accuracy of the choice of priors. A robust
detector [Kassam and Poor 1985], on the other hand, does not use any partial knowledge
about the noise PDF and therefore is not expected to perform well. Locally optimal
(LO) detectors for this problem have been studied extensively by Czarnecki, Martinez,
Thomas and others [Czarnecki and Thomas 1984], [Martinez and Thomas 1982]. Their
results, however, rely on a known covariance matrix and marginal PDF of the noise. A
third dimension is added to the problem if the amplitude of the signal is not known
[Kay 1985]. A locally optimal detector cannot be used since it depends on the polarity
or sign of the amplitude, which is usually unknown.

This paper addresses the problem of detecting a deterministic signal known except
for amplitude in the presence of incompletely characterized non-white non-Gaussian
noise. The approach chosen here is to use the theory of the generalized likelihood ratio
test (GLRT) for composite hypothesis testing [Kendall and Stuart 1979]. The work
presented here is an extension of the work of Kay [1985], in which the noise is assumed
to be non-Gaussian but white. In the present case the covariance matrix is assumed to be
known except for a few parameters. Maximum likelihood estimates (MLE) for these
parameters are then used in the GLRT. The asymptotic performance of the GLRT
detector is shown to be equivalent to the asymptotic performance of the clairvoyant
GLRT detector (one which uses perfect knowledge of the unknown parameters) for a
symmetric noise PDF. Therefore the GLRT asymptotically achieves an upper bound
in performance and is optimal in this sense.

The paper is organized as follows. Section II gives the theory of the GLRT which
will be used extensively in the subsequent sections. Section III formulates the detection
problem and derives the GLRT for it. The case of autoregressive (AR) noise is
considered separately. Section IV discusses the performance of the GLRT detector and
compares it to that of the clairvoyant GLRT detector. Section V draws some general
conclusions about the performance of the GLRT. Section VI summarizes the results
and discusses the implementation aspect of the problem.


II.  Review  of  Generalized  Likelihood  Ratio  Test 


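Consider the problem of testing the value of the parameter $\Theta = [\Theta_r^T\ \Theta_s^T]^T$ based
on the data set $\mathbf{y} = [y_1\ y_2\ \cdots\ y_N]$. $\Theta_r$ and $\Theta_s$ are assumed to be vectors of dimension
r and s, respectively. A common hypothesis test is

$$\begin{aligned}
\mathcal{H}_0 &: \Theta^T = [\mathbf{0}^T\ \Theta_s^T] \\
\mathcal{H}_1 &: \Theta^T = [\Theta_r^T\ \Theta_s^T], \quad \Theta_r \neq \mathbf{0}
\end{aligned} \tag{1}$$

$\Theta_s$, referred to as the vector of nuisance parameters, is of no concern and may assume
any value. Assuming the observed data $\mathbf{y}$ has a joint probability function $f(\mathbf{y}; \Theta_r, \Theta_s)$,
a generalized likelihood ratio test for testing (1) is to decide $\mathcal{H}_1$ if

$$L_G = \frac{f(\mathbf{y}; \hat{\hat{\Theta}}_r, \hat{\hat{\Theta}}_s)}{f(\mathbf{y}; \mathbf{0}, \hat{\Theta}_s)} > \gamma \tag{2}$$

for some threshold $\gamma$. $\mathbf{0}$ is an r-dimensional vector of zeroes. $\hat{\Theta}_s$ is the MLE of $\Theta_s$
assuming $\mathcal{H}_0$ is true, while $\hat{\hat{\Theta}}_r$ and $\hat{\hat{\Theta}}_s$ are joint MLEs of $\Theta_r$ and $\Theta_s$ assuming $\mathcal{H}_1$ is
true. $\hat{\Theta}_s$ is found by maximizing $f(\mathbf{y}; \mathbf{0}, \Theta_s)$ over $\Theta_s$. Similarly, $\hat{\hat{\Theta}}_r$, $\hat{\hat{\Theta}}_s$ are obtained
by maximizing $f(\mathbf{y}; \Theta_r, \Theta_s)$ over $\Theta_r$ and $\Theta_s$.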

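The statistics of $L_G$ are difficult to obtain in general. For large data records (asymptotically)
it may be shown that $2\ln L_G$ is distributed in the following manner [Kendall
and Stuart 1979].

$$2\ln L_G \sim \chi_r^2 \quad \text{under } \mathcal{H}_0 \tag{3a}$$

$$2\ln L_G \sim \chi'^{2}(r, \lambda) \quad \text{under } \mathcal{H}_1 \tag{3b}$$

Here $\chi_r^2$ represents a chi-square distribution with r degrees of freedom and $\chi'^{2}(r, \lambda)$
represents a noncentral chi-square distribution with r degrees of freedom and noncentrality
parameter $\lambda$. Note that $\chi'^{2}(r, 0) = \chi_r^2$, i.e., the distribution under $\mathcal{H}_0$ is a special
case of the distribution under $\mathcal{H}_1$ and occurs when $\lambda = 0$. The noncentrality parameter
$\lambda$, which is a measure of the discrimination between the two hypotheses, is given by

$$\lambda = \Theta_r^T \left[ I_{\Theta_r\Theta_r}(\mathbf{0}, \Theta_s) - I_{\Theta_r\Theta_s}(\mathbf{0}, \Theta_s)\, I_{\Theta_s\Theta_s}^{-1}(\mathbf{0}, \Theta_s)\, I_{\Theta_s\Theta_r}(\mathbf{0}, \Theta_s) \right] \Theta_r \tag{4}$$

where $\Theta_r$, $\Theta_s$ are the true values. The terms in the brackets of (4) are found by
partitioning the Fisher information matrix for $\Theta$ as

$$I(\Theta) = \begin{bmatrix} I_{\Theta_r\Theta_r}(\Theta_r, \Theta_s) & I_{\Theta_r\Theta_s}(\Theta_r, \Theta_s) \\ I_{\Theta_s\Theta_r}(\Theta_r, \Theta_s) & I_{\Theta_s\Theta_s}(\Theta_r, \Theta_s) \end{bmatrix} \tag{5}$$

and the partitions are defined as

$$\begin{aligned}
I_{\Theta_r\Theta_r}(\Theta_r, \Theta_s) &= E\left[\left(\frac{\partial \ln f}{\partial \Theta_r}\right)\left(\frac{\partial \ln f}{\partial \Theta_r}\right)^T\right] && r \times r \\
I_{\Theta_r\Theta_s}(\Theta_r, \Theta_s) &= E\left[\left(\frac{\partial \ln f}{\partial \Theta_r}\right)\left(\frac{\partial \ln f}{\partial \Theta_s}\right)^T\right] && r \times s \\
I_{\Theta_s\Theta_r}(\Theta_r, \Theta_s) &= I_{\Theta_r\Theta_s}^T(\Theta_r, \Theta_s) && s \times r \\
I_{\Theta_s\Theta_s}(\Theta_r, \Theta_s) &= E\left[\left(\frac{\partial \ln f}{\partial \Theta_s}\right)\left(\frac{\partial \ln f}{\partial \Theta_s}\right)^T\right] && s \times s
\end{aligned} \tag{6}$$

All the partitions of the Fisher information matrix are evaluated at $\Theta_r = \mathbf{0}$ and the
true values of $\Theta_s$ for use in (4).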

The motivation for using a GLRT is that for large data records it exhibits certain
optimality properties. A uniformly most powerful (UMP) test does not exist in many
situations. However, of all the tests which are invariant to a natural set of transformations
the GLRT exhibits the largest probability of detection. The GLRT is said to
be the asymptotically uniformly most powerful invariant (UMPI) test [Lehmann 1959].
It is also a consistent test in the sense that the probability of deciding $\mathcal{H}_0$ when $\mathcal{H}_1$ is
actually true approaches 0 for large data records. Asymptotically the GLRT is unbiased,
i.e., the probability of detection when $\mathcal{H}_1$ is true is larger than the probability
of false alarm. (This result follows from (3) and properties of the chi-square distribution.)
Finally, although the GLRT does not usually exhibit a constant false alarm rate
(CFAR), it does so for large data records. It is difficult to find the conditions under
which the asymptotic results apply to finite length data records. The following heuristic
conditions follow from [Cox and Hinkley 1974].

1) The asymptotic statistics of the MLEs used in the likelihood ratio should
be applicable, i.e., they should be Gaussian with mean equal to the true
parameter value and covariance matrix equal to the inverse of the Fisher
information matrix.

2) The two hypotheses should be reasonably close and only slight departures of
$\Theta_r$ from zero should be tested.

III.  Formulation  of  the  Problem  and  GLRT  Solution 
The  detection  problem  considered  here  is  the  following. 

$$\begin{aligned}
\mathcal{H}_0 &: \mathbf{y} = \mathbf{W}\mathbf{u} \\
\mathcal{H}_1 &: \mathbf{y} = \mathbf{W}\mathbf{u} + \mu\mathbf{s}
\end{aligned} \tag{7}$$

where $\mathbf{s} = [s_1\ s_2\ \cdots\ s_N]^T$ is a vector of known signal amplitudes, $\mathbf{u} = [u_1\ u_2\ \cdots\ u_N]^T$
is a vector of independent and identically distributed (i.i.d.) noise with a symmetric
PDF, $\mu$ is an unknown scalar (either positive or negative) and $\mathbf{W}$ is an invertible
(N x N) matrix whose elements are functions of a set of unknown parameters $\Psi =
[\psi_1\ \psi_2\ \cdots\ \psi_m]$,

$$[\mathbf{W}]_{ij} = w_{ij}(\Psi)$$

Since $u_n$, $n = 1, 2, \cdots, N$ are i.i.d., the PDF of $\mathbf{u}$ can be expressed as

$$f(\mathbf{u}; \Phi) = \prod_{n=1}^{N} f(u_n; \Phi) \tag{8}$$

where $f(u_n; \Phi)$ is the marginal PDF of each $u_n$, dependent on the unknown parameter
vector $\Phi$. $f$ is assumed to be symmetric, i.e., $f(-u) = f(u)$. Note that the covariance
matrix of the noise is $\sigma^2\mathbf{W}\mathbf{W}^T$ where $\sigma^2$ is the variance of $u_n$.

Equation (7) represents a general set of problems. First, the unknown matrix $\mathbf{W}$ allows for a
large class of spectral characteristics or correlation patterns of the background noise.
For large data records autoregressive (AR), moving average (MA) and autoregressive
moving average (ARMA) processes can be represented by the above formulation of the
underlying random process if $\mathbf{W}$ is the impulse response matrix of the corresponding
filter. Secondly, the PDF of $u_n$ can be chosen to characterize specific problems in a
realistic way. The parameter vector $\Phi$ is left unknown in order to add flexibility to the
noise PDF model. Thirdly, by allowing $\mu$ to be positive or negative the detector will
be able to accommodate a change of polarity in the signal.

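The problem of (7) can be recast as

$$\mathcal{H}_0 : \Theta^T = [0\ \ \Theta_s^T] \tag{9a}$$

$$\mathcal{H}_1 : \Theta^T = [\Theta_r\ \ \Theta_s^T], \quad \Theta_r \neq 0 \tag{9b}$$

where

$$\begin{aligned}
\Theta_r &= \mu \quad \text{(a scalar)} \\
\Theta_s &= [\Psi^T\ \Phi^T]^T \quad \text{(vector of nuisance parameters)}
\end{aligned} \tag{10}$$

Since (9) is equivalent to (1), the GLRT for testing $\mathcal{H}_1$ vs. $\mathcal{H}_0$ is given by (2). In
order to evaluate the MLEs it is necessary to find the joint PDF of $\mathbf{y}$ under either
hypothesis, which can be found from the joint PDF of $\mathbf{u}$ in the following way. From (7)
it follows that

$$\mathbf{u} = \mathbf{W}^{-1}\mathbf{y} \quad \text{under } \mathcal{H}_0 \tag{11a}$$

$$\mathbf{u} = \mathbf{W}^{-1}(\mathbf{y} - \mu\mathbf{s}) \quad \text{under } \mathcal{H}_1 \tag{11b}$$

$\mathbf{W}^{-1}$ exists because $\mathbf{W}$ is assumed to be invertible. The elements of $\mathbf{W}^{-1}$ are also
known functions of $\Psi$.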

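Since (11) is an affine transformation, the joint PDF of $\mathbf{y}$ can be written as

$$\begin{aligned}
f(\mathbf{y}; \Psi, \Phi) &= \frac{1}{|\det(\mathbf{W})|}\, f(\mathbf{u}; \Phi)\Big|_{\mathbf{u} = \mathbf{W}^{-1}\mathbf{y}} && \text{under } \mathcal{H}_0 \\
f(\mathbf{y}; \mu, \Psi, \Phi) &= \frac{1}{|\det(\mathbf{W})|}\, f(\mathbf{u}; \Phi)\Big|_{\mathbf{u} = \mathbf{W}^{-1}(\mathbf{y} - \mu\mathbf{s})} && \text{under } \mathcal{H}_1
\end{aligned} \tag{12}$$

which in view of (8) and (11) reduces to

$$f(\mathbf{y}; \Psi, \Phi) = \frac{1}{|\det(\mathbf{W})|} \prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\Psi)]_{nj}\, y_j ;\ \Phi \right) \quad \text{under } \mathcal{H}_0 \tag{13a}$$

$$f(\mathbf{y}; \mu, \Psi, \Phi) = \frac{1}{|\det(\mathbf{W})|} \prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\Psi)]_{nj}\, (y_j - \mu s_j) ;\ \Phi \right) \quad \text{under } \mathcal{H}_1 \tag{13b}$$

Therefore the GLRT for this problem is to decide $\mathcal{H}_1$ if

$$L_G = \frac{\displaystyle\prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\hat{\hat{\Psi}})]_{nj}\, (y_j - \hat{\hat{\mu}} s_j) ;\ \hat{\hat{\Phi}} \right)}{\displaystyle\prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\hat{\Psi})]_{nj}\, y_j ;\ \hat{\Phi} \right)} > \gamma \tag{14}$$

where hats denote MLEs under $\mathcal{H}_0$ and double hats denote MLEs under $\mathcal{H}_1$. It is assumed
that the values of $|\det(\mathbf{W})|$ under $\mathcal{H}_0$ and $\mathcal{H}_1$ are nearly the same, so that the determinant
factors cancel in the ratio. This assumption
simplifies the problem considerably. The threshold $\gamma$ is adjusted to achieve a given
probability of false alarm, as will be discussed in the next section.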

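Note that if $\Psi$ is known so that $\mathbf{W}^{-1}\mathbf{y}$ can be computed, then (7) reduces to

$$\begin{aligned}
\mathcal{H}_0 &: \mathbf{W}^{-1}\mathbf{y} = \mathbf{u} \\
\mathcal{H}_1 &: \mathbf{W}^{-1}\mathbf{y} = \mathbf{u} + \mu\mathbf{W}^{-1}\mathbf{s}
\end{aligned}$$

which is simply the problem of detecting the transformed signal $\mathbf{W}^{-1}\mathbf{s}$ of unknown amplitude
$\mu$ in i.i.d. noise from the transformed observation vector $\mathbf{W}^{-1}\mathbf{y}$. The likelihood
ratio corresponding to the GLRT for this problem is

$$L_G = \frac{\displaystyle\prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\Psi)]_{nj}\, (y_j - \hat{\hat{\mu}} s_j) ;\ \hat{\hat{\Phi}} \right)}{\displaystyle\prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\Psi)]_{nj}\, y_j ;\ \hat{\Phi} \right)}$$

The same statistic is used for the case of unknown $\Psi$ by replacing it with its MLE
under the respective hypotheses for numerator and denominator as per (14).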

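Another special case of (7) arises when the noise is white, i.e., $\mathbf{W} = \mathbf{I}$, where $\mathbf{I}$ is
the identity matrix. (14) then reduces to [Kay 1985]

$$L_G = \frac{\displaystyle\prod_{n=1}^{N} f(y_n - \hat{\hat{\mu}} s_n;\ \hat{\hat{\Phi}})}{\displaystyle\prod_{n=1}^{N} f(y_n;\ \hat{\Phi})} > \gamma$$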
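As a concrete instance of the white-noise statistic, the sketch below works out the one case where it has a simple closed form: Gaussian noise with a known variance. This is an illustrative assumption, not the non-Gaussian setting of the paper. Under that assumption the amplitude MLE is the correlation of the data with the signal divided by the signal energy, and twice the log-likelihood ratio is the squared correlation divided by the variance times the signal energy. The function name `glrt_white_gaussian` and the test data are hypothetical.

```python
def glrt_white_gaussian(y, s, sigma2):
    """Return (mu_hat, 2 ln L_G) for y = mu*s + u with white Gaussian u.

    Assumes the noise variance sigma2 is known, so only mu is estimated.
    """
    sy = sum(yn * sn for yn, sn in zip(y, s))   # correlation of data and signal
    ss = sum(sn * sn for sn in s)               # signal energy
    mu_hat = sy / ss                            # MLE of the amplitude under H1
    stat = sy * sy / (sigma2 * ss)              # 2 ln L_G in closed form
    return mu_hat, stat


# Noise-free data with amplitude +0.5 and -0.5: the statistic is the same
# for both polarities, which is why the sign of mu need not be known.
s = [1.0, -1.0, 1.0, -1.0]
mu_pos, stat_pos = glrt_white_gaussian([0.5 * sn for sn in s], s, sigma2=1.0)
mu_neg, stat_neg = glrt_white_gaussian([-0.5 * sn for sn in s], s, sigma2=1.0)
```

For a non-Gaussian $f$ the amplitude MLE generally has no closed form and would have to be found by numerical maximization.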

It  was  indicated  earlier  in  this  section  that  the  linear  model  (7)  is  capable  of 
representing  the  case  of  AR  noise  for  large  data  records.  The  advantage  of  AR  modeling 
of  the  noise  as  opposed  to  an  ARMA  or  MA  model  is  that  it  is  easier  to  estimate  the 
unknown  parameters  as  required  by  the  GLRT.  This  case  is  now  examined  in  detail. 
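The detection problem for AR noise is

$$\begin{aligned}
\mathcal{H}_0 &: \mathbf{y} = \mathbf{x} \\
\mathcal{H}_1 &: \mathbf{y} = \mathbf{x} + \mu\mathbf{s}
\end{aligned} \tag{15}$$

with

$$\mathbf{x} = [x_1\ x_2\ \cdots\ x_N]^T$$

It is assumed that the sequence $\{x_1, x_2, \cdots, x_N\}$ is the output of a pth order all-pole
filter excited by white driving noise, or

$$x_n = -\sum_{j=1}^{p} a_j x_{n-j} + u_n, \quad n = 1, 2, \cdots, N$$

alternately,

$$u_n = x_n + \sum_{j=1}^{p} a_j x_{n-j} = \sum_{j=0}^{p} a_j x_{n-j}, \quad n = 1, 2, \cdots, N$$

assuming $a_0 = 1$. $u_n$ can also be written as a function of $\mathbf{y}$ under either hypothesis:

$$u_n = \sum_{j=0}^{p} a_j\, y_{n-j}, \quad n = 1, 2, \cdots, N \quad \text{under } \mathcal{H}_0 \tag{16a}$$

$$u_n = \sum_{j=0}^{p} a_j\, (y_{n-j} - \mu s_{n-j}), \quad n = 1, 2, \cdots, N \quad \text{under } \mathcal{H}_1 \tag{16b}$$

Note that $u_1, u_2, \cdots, u_p$ involve samples prior to $y_1$ which are outside the observation
interval. These are assumed to be zero for simplicity. For large data records this
assumption will not change the character of the GLRT. In matrix form

$$\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{p+1} \\ \vdots \\ u_N \end{bmatrix} =
\begin{bmatrix}
1 & 0 & \cdots & & & 0 \\
a_1 & 1 & 0 & \cdots & & 0 \\
\vdots & & \ddots & & & \vdots \\
a_p & \cdots & a_1 & 1 & \cdots & 0 \\
\vdots & \ddots & & & \ddots & \vdots \\
0 & \cdots & a_p & \cdots & a_1 & 1
\end{bmatrix}
\begin{bmatrix} y_1 - \mu s_1 \\ y_2 - \mu s_2 \\ \vdots \\ y_{p+1} - \mu s_{p+1} \\ \vdots \\ y_N - \mu s_N \end{bmatrix} \tag{17}$$

under $\mathcal{H}_1$. The equation is the same under $\mathcal{H}_0$, except that $\mu = 0$. (17) is a special case
of (11) with $\Psi = \mathbf{a} = [a_1\ a_2\ \cdots\ a_p]$ and $m = p$. $\mathbf{W}^{-1}$ is a lower triangular Toeplitz
matrix given by

$$[\mathbf{W}^{-1}]_{ij}(\mathbf{a}) = \begin{cases} 0, & \text{if } i < j, \\ a_{i-j}, & \text{if } j \le i \le j + p, \\ 0, & \text{if } j + p < i. \end{cases}$$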
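The banded Toeplitz structure of the AR whitening matrix can be checked numerically. The sketch below builds the matrix from a coefficient vector and compares the matrix product of (17) with the direct recursion of (16b), taking samples before the first observation as zero, as the text assumes. The function names and numerical values are hypothetical and chosen only for illustration.

```python
def ar_whitening_matrix(a, n):
    """N x N lower-triangular Toeplitz matrix with entries a_{i-j}, a_0 = 1."""
    coeffs = [1.0] + list(a)          # a_0, a_1, ..., a_p
    p = len(a)
    return [[coeffs[i - j] if 0 <= i - j <= p else 0.0 for j in range(n)]
            for i in range(n)]


def ar_residuals(y, s, a, mu=0.0):
    """u_n = sum_j a_j (y_{n-j} - mu s_{n-j}); pre-window samples taken as zero."""
    coeffs = [1.0] + list(a)
    d = [yn - mu * sn for yn, sn in zip(y, s)]   # data with the signal removed
    return [sum(c * d[n - j] for j, c in enumerate(coeffs) if n - j >= 0)
            for n in range(len(d))]


a = [-0.5, 0.25]                      # hypothetical AR(2) coefficients a_1, a_2
y = [1.0, 2.0, 3.0, 4.0]
s = [1.0, 1.0, 1.0, 1.0]
mu = 1.0

W_inv = ar_whitening_matrix(a, len(y))
u_matrix = [sum(W_inv[i][j] * (y[j] - mu * s[j]) for j in range(len(y)))
            for i in range(len(y))]
u_direct = ar_residuals(y, s, a, mu=mu)
```

Setting `mu=0.0` gives the residuals under the noise-only hypothesis (16a).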
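To avoid having to assume that $\{y_{-(p-1)}, y_{-(p-2)}, \cdots, y_0\}$ are zero one can proceed
as follows. Consider only the last (N - p) equations of (16), which expressed in
matrix form are

$$\begin{bmatrix} u_{p+1} \\ u_{p+2} \\ \vdots \\ u_{2p+1} \\ \vdots \\ u_N \end{bmatrix} =
\begin{bmatrix}
1 & 0 & \cdots & & & 0 \\
a_1 & 1 & 0 & \cdots & & 0 \\
\vdots & & \ddots & & & \vdots \\
a_p & \cdots & a_1 & 1 & \cdots & 0 \\
\vdots & \ddots & & & \ddots & \vdots \\
0 & \cdots & a_p & \cdots & a_1 & 1
\end{bmatrix}
\begin{bmatrix} y_{p+1} - \mu s_{p+1} \\ y_{p+2} - \mu s_{p+2} \\ \vdots \\ y_{2p+1} - \mu s_{2p+1} \\ \vdots \\ y_N - \mu s_N \end{bmatrix}
+
\begin{bmatrix} \sum_{j=1}^{p} a_{p-j+1}(y_j - \mu s_j) \\ \sum_{j=2}^{p} a_{p-j+2}(y_j - \mu s_j) \\ \vdots \\ a_p (y_p - \mu s_p) \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{18}$$

under $\mathcal{H}_1$. Substitution of $\mu = 0$ in (18) gives the corresponding equation for $\mathcal{H}_0$.
(18) is also a special case of (11b) except that only the last (N - p) of the N scalar
equations implied by (11b) are used. Since the added vector causes a departure from
the general model, the previous results cannot be used. To determine the GLRT, first
consider the conditional likelihood function. In this case the conditional likelihood of
$y_{p+1}, y_{p+2}, \cdots, y_N$ given $y_1, y_2, \cdots, y_p$ is

$$f(y_{p+1}, y_{p+2}, \cdots, y_N \mid y_1, y_2, \cdots, y_p) = \prod_{n=p+1}^{N} f\left( \sum_{j=0}^{p} a_j\, y_{n-j} ;\ \Phi \right) \quad \text{under } \mathcal{H}_0 \tag{19a}$$

$$f(y_{p+1}, y_{p+2}, \cdots, y_N \mid y_1, y_2, \cdots, y_p) = \prod_{n=p+1}^{N} f\left( \sum_{j=0}^{p} a_j\, (y_{n-j} - \mu s_{n-j}) ;\ \Phi \right) \quad \text{under } \mathcal{H}_1 \tag{19b}$$

The likelihood ratio is given by

$$L_G = \frac{\displaystyle\prod_{n=p+1}^{N} f\left( \sum_{j=0}^{p} \hat{\hat{a}}_j\, (y_{n-j} - \hat{\hat{\mu}} s_{n-j}) ;\ \hat{\hat{\Phi}} \right)}{\displaystyle\prod_{n=p+1}^{N} f\left( \sum_{j=0}^{p} \hat{a}_j\, y_{n-j} ;\ \hat{\Phi} \right)} \cdot \frac{f(y_1, y_2, \cdots, y_p;\ \hat{\hat{\mu}}, \hat{\hat{\mathbf{a}}}, \hat{\hat{\Phi}})}{f(y_1, y_2, \cdots, y_p;\ 0, \hat{\mathbf{a}}, \hat{\Phi})}$$

where $\hat{a}_0$ and $\hat{\hat{a}}_0$ are defined to be unity. The second term is dropped for ease of
computation. A heuristic justification for ignoring the second term is that when N is
large, its contribution to $L_G$ will be negligible. The closer the poles of the AR model
are to the unit circle, the larger N must be [Box and Jenkins 1970], [Kay 1981].
With this simplification, the GLRT decides $\mathcal{H}_1$ if

$$L_G = \frac{\displaystyle\prod_{n=p+1}^{N} f\left( \sum_{j=0}^{p} \hat{\hat{a}}_j\, (y_{n-j} - \hat{\hat{\mu}} s_{n-j}) ;\ \hat{\hat{\Phi}} \right)}{\displaystyle\prod_{n=p+1}^{N} f\left( \sum_{j=0}^{p} \hat{a}_j\, y_{n-j} ;\ \hat{\Phi} \right)} > \gamma \tag{20}$$

A comparison of (14) and (20) shows that the latter uses fewer terms in the product.
However, both formulations are clearly asymptotically equivalent. Figure 1 is a block
diagram to generate $2\ln L_G$ from the data. The reason for computing $2\ln L_G$ instead of
$L_G$ will be clear from the discussion in the next section. The block diagram is very
similar to that obtained by [Kay 1983] for the detection of a completely known signal
in unknown colored Gaussian noise. In the Gaussian case $\ln f$ is a simple quadratic,
while for the general non-Gaussian case it will be highly non-linear. Figure 1 also uses
an estimator for $\mu$, which was assumed to be known in [Kay 1983].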

IV.  Asymptotic  Performance  of  the  GLRT  Detector 

Asymptotic distributions of $2\ln L_G$ under $\mathcal{H}_0$ and $\mathcal{H}_1$ are given by (3a) and (3b),
respectively. In this case $\Theta_r = \mu$, $\Theta_s = [\Psi^T\ \Phi^T]^T$ for the general linear model and
$\Theta_s = [\mathbf{a}^T\ \Phi^T]^T$ for the AR case. Hence the noncentrality parameter is

$$\lambda = \mu^2 \left[ I_{\mu\mu}(0, \Theta_s) - I_{\mu\Theta_s}(0, \Theta_s)\, I_{\Theta_s\Theta_s}^{-1}(0, \Theta_s)\, I_{\mu\Theta_s}^T(0, \Theta_s) \right] \tag{21}$$

The probability of false alarm is

$$P_{FA} = P\{2\ln L_G > \gamma' \mid \mathcal{H}_0\} \tag{22}$$

where $\gamma' = 2\ln\gamma$. The probability of detection is

$$P_D = P\{2\ln L_G > \gamma' \mid \mathcal{H}_1\} \tag{23}$$

The two probabilities can be calculated from the tables of central and noncentral
chi-squared distributions, respectively. In practice, $\gamma'$ can be set to produce a given false
alarm rate and $P_D$ can be calculated from (23) accordingly.
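For the scalar-amplitude problem treated here (r = 1), the table lookups in (22) and (23) can be replaced by the Gaussian tail function, because a chi-square variable with one degree of freedom is the square of a unit-variance Gaussian variable. The sketch below is a minimal illustration under that r = 1 assumption and uses only the Python standard library; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def q(x):
    """Gaussian right-tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def threshold_for_pfa(pfa):
    """Threshold gamma' such that P{chi-square(1) > gamma'} = pfa."""
    return NormalDist().inv_cdf(1.0 - pfa / 2.0) ** 2


def detection_probability(gamma_prime, lam):
    """P{noncentral chi-square(1, lam) > gamma'} for noncentrality lam >= 0."""
    root_g, root_l = math.sqrt(gamma_prime), math.sqrt(lam)
    # (Z + sqrt(lam))^2 > gamma'  <=>  Z > root_g - root_l  or  Z < -root_g - root_l
    return q(root_g - root_l) + q(root_g + root_l)


gamma_prime = threshold_for_pfa(0.01)
pd_values = [detection_probability(gamma_prime, lam)
             for lam in (0.0, 5.0, 10.0, 20.0)]
```

With zero noncentrality the detection probability reduces to the false alarm probability, and it grows with the noncentrality parameter, which is the monotonicity used later to compare detectors.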

As indicated before, there is no UMP test for the detection problem considered in
this paper. Therefore there is no upper bound to which the performance of any detector
may be compared. However, the performance of the GLRT is better appreciated when
compared to that of a clairvoyant GLRT. A clairvoyant GLRT is one which uses perfect
knowledge of the nuisance parameters $\Theta_s$. The likelihood ratio in this case is

$$L_{Gc} = \frac{f(\mathbf{y}; \hat{\hat{\Theta}}_r, \Theta_s)}{f(\mathbf{y}; \mathbf{0}, \Theta_s)}$$

which in view of (13) is

$$L_{Gc} = \frac{\displaystyle\prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\Psi)]_{nj}\, (y_j - \hat{\hat{\mu}} s_j) ;\ \Phi \right)}{\displaystyle\prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}(\Psi)]_{nj}\, y_j ;\ \Phi \right)}$$

where $\Psi$ and $\Phi$ are assumed to be known. Asymptotically, $2\ln L_{Gc}$ is distributed as

$$2\ln L_{Gc} \sim \chi_r^2 \quad \text{under } \mathcal{H}_0 \tag{24a}$$

$$2\ln L_{Gc} \sim \chi'^{2}(r, \lambda_c) \quad \text{under } \mathcal{H}_1 \tag{24b}$$

where

$$\lambda_c = \Theta_r^T\, I_{\Theta_r\Theta_r}(\mathbf{0}, \Theta_s)\, \Theta_r$$

For the problem considered here, r = 1 and $\Theta_r = \mu$. Hence

$$\lambda_c = \mu^2 I_{\mu\mu}(0, \Theta_s) \tag{25}$$

Comparing $\lambda$ and $\lambda_c$ as given by (21) and (25), respectively, it is apparent that $\lambda$ is
equal to $\lambda_c$ less an additional term. Assuming that $I_{\Theta_s\Theta_s}$ is positive semidefinite,

$$\lambda_c \ge \lambda$$

From the theory of the noncentral chi-square distribution it can be shown that $P_D$ as given
by (23) is a monotonic function of the noncentrality parameter, which implies that $P_D$
for the clairvoyant GLRT is greater than or equal to that for the GLRT [Sengupta
1986]. Therefore the clairvoyant GLRT detector, although impractical in this case,
provides an upper bound on the performance of the GLRT detector. In order that the
upper bound be achieved, $\lambda$ should be equal to $\lambda_c$. This will occur if

$$I_{\mu\Theta_s}(0, \Theta_s) = \mathbf{0} \tag{26}$$

Appendices A and B show that this is indeed the case for the general linear model of
(7) and the AR noise model of (15), respectively, if as assumed $f$ is a symmetric PDF.
Therefore the asymptotic performance of the GLRT is equivalent to the performance
of the clairvoyant GLRT for detection in the presence of non-Gaussian noise modeled as
in (7) or (15). This implies that one can do as well in detecting a signal of unknown
amplitude as if the unknown noise parameters were known.
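The relation between the two noncentrality parameters can be illustrated with a small numerical sketch. Given the scalar block, the cross block and the nuisance block of a Fisher information matrix, it evaluates (21) and (25) and shows that the loss term vanishes when the cross block is zero, as in (26). The matrix entries below are made-up numbers for illustration only, and the helper names are hypothetical.

```python
def solve(c, b):
    """Solve C x = b by Gaussian elimination (C assumed symmetric positive definite)."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(c, b)]      # augmented matrix [C | b]
    for k in range(n):
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def noncentrality(mu, i_mumu, i_mus, i_ss):
    """Return (lambda, lambda_c) for a scalar amplitude mu."""
    x = solve(i_ss, i_mus)                            # I_ss^{-1} I_mus^T
    loss = sum(bi * xi for bi, xi in zip(i_mus, x))   # I_mus I_ss^{-1} I_mus^T
    return mu * mu * (i_mumu - loss), mu * mu * i_mumu


i_ss = [[2.0, 0.0], [0.0, 1.0]]
lam, lam_c = noncentrality(2.0, 3.0, [1.0, 0.5], i_ss)      # coupled case
lam0, lam_c0 = noncentrality(2.0, 3.0, [0.0, 0.0], i_ss)    # cross block zero
```

In the coupled case the estimated-parameter detector has the smaller noncentrality parameter. With a zero cross block the two coincide, which is the situation the appendices establish for a symmetric PDF.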

V.  General  Conclusions  about  the  Performance  of  the  GLRT 

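A key to the asymptotic performance of the GLRT detector is the noncentrality
parameter $\lambda$, which is found to be equivalent to $\lambda_c$. It would be interesting to examine
how $\lambda$ depends on the statistical properties of the noise. First consider

$$\begin{aligned}
I_{\mu\mu}(0, \Theta_s) &= E\left[\left(\frac{\partial \ln f}{\partial \mu}\right)^2\right] \\
&= E\left[\left(\frac{\partial}{\partial \mu} \ln \prod_{n=1}^{N} f\left( \sum_{j=1}^{N} [\mathbf{W}^{-1}]_{nj}\, (y_j - \mu s_j) ;\ \Phi \right)\right)^2\right] \\
&= E\left[\left(\frac{\partial}{\partial \mu} \ln \prod_{n=1}^{N} f(u_n; \Phi)\right)^2\right] \quad \text{using (11b) and (12)} \\
&= E \sum_{n=1}^{N} \left[\frac{\partial}{\partial \mu} \ln f(u_n; \Phi)\right]^2
\end{aligned} \tag{27}$$

The last step follows from the facts that the $u_n$'s are i.i.d. and that the cross-terms are
zero, since

$$E\left[\frac{\partial}{\partial \mu} \ln f(u_n; \Phi)\right] = 0$$

under certain regularity assumptions on $f$ [Bickel and Doksum 1977]. Writing (11b)
explicitly as

$$u_n = \sum_{j=1}^{N} [\mathbf{W}^{-1}]_{nj}\, (y_j - \mu s_j) \tag{28}$$

it follows that

$$\frac{\partial u_n}{\partial \mu} = -\sum_{j=1}^{N} [\mathbf{W}^{-1}]_{nj}\, s_j \tag{29}$$

Hence (27) can be rewritten as

$$I_{\mu\mu}(0, \Theta_s) = \frac{1}{\sigma^2} \sum_{n=1}^{N} \left( \sum_{j=1}^{N} [\mathbf{W}^{-1}]_{nj}\, s_j \right)^2 \sigma^2 I_f(\Phi) \tag{30}$$

where $\sigma^2$ is the variance of $u_n$ and

$$I_f(\Phi) = E\left[\left(\frac{\partial \ln f(u_n; \Phi)}{\partial u_n}\right)^2\right]$$

does not depend on $\mu$ or $n$. The expectation is with respect to $u_n$ only since the $u_n$'s are
i.i.d. It follows from (25) that

$$\begin{aligned}
\lambda &= \frac{\mu^2}{\sigma^2} (\mathbf{W}^{-1}\mathbf{s})^T (\mathbf{W}^{-1}\mathbf{s})\, \sigma^2 I_f(\Phi) \\
&= \frac{\mu^2}{\sigma^2}\, \mathbf{s}^T (\mathbf{W}^{-1T}\mathbf{W}^{-1})\, \mathbf{s}\, \sigma^2 I_f(\Phi) \\
&= \mathbf{s}_0^T \mathbf{R}^{-1} \mathbf{s}_0\, [\sigma^2 I_f(\Phi)]
\end{aligned} \tag{31}$$

where $\mathbf{R} = \sigma^2\mathbf{W}\mathbf{W}^T$ is the N x N autocorrelation matrix of the colored noise and
$\mathbf{s}_0 = \mu\mathbf{s}$ is the signal vector including the amplitude. $\mathbf{s}_0^T\mathbf{R}^{-1}\mathbf{s}_0$ is the signal to noise
ratio (SNR) at the output of a prewhitener followed by a matched filter (or correlator),
both built with perfect knowledge of the filter parameters ($\Psi$). To be more precise,
if the data is passed through an ideal whitener (a filter which will completely whiten
the noise) and correlated (multiplied term-by-term and summed) with the output of
a similar filter through which only the known signal is passed, then $\mathbf{s}_0^T\mathbf{R}^{-1}\mathbf{s}_0$ is the
squared ratio of the contributions from the signal and noise parts of the data. In the
case of AR noise, a similar derivation using (16b) (instead of (28)) gives

$$I_{\mu\mu}(0, \Theta_s) = \frac{1}{\sigma^2} \sum_{n=p+1}^{N} \left( -\sum_{j=0}^{p} a_j s_{n-j} \right)^2 \sigma^2 I_f(\Phi) \tag{32}$$

and

$$\begin{aligned}
\lambda &= \frac{\mu^2}{\sigma^2} \sum_{n=p+1}^{N} \left( -\sum_{j=0}^{p} a_j s_{n-j} \right)^2 \sigma^2 I_f(\Phi) \\
&= \frac{\mu^2}{\sigma^2} (\mathbf{A}\mathbf{s})^T (\mathbf{A}\mathbf{s})\, \sigma^2 I_f(\Phi)
\end{aligned} \tag{33}$$

where $\mathbf{A}$ is the (N - p) x N Toeplitz matrix

$$\mathbf{A} = \begin{bmatrix}
a_p & \cdots & a_1 & 1 & 0 & \cdots & 0 \\
0 & a_p & \cdots & a_1 & 1 & \cdots & 0 \\
\vdots & & \ddots & & & \ddots & \vdots \\
0 & \cdots & 0 & a_p & \cdots & a_1 & 1
\end{bmatrix}$$

Therefore

$$\lambda = \mathbf{s}_0^T \mathbf{R}^{-1} \mathbf{s}_0\, [\sigma^2 I_f(\Phi)] \tag{34}$$

$\mathbf{s}_0 = \mu\mathbf{s}$ as before and $\mathbf{R} = \sigma^2(\mathbf{A}^T\mathbf{A})^{-1}$ is approximately the covariance matrix of the
noise. Clearly, $\lambda$ is proportional to the SNR at the output of a prewhitener-correlator
(using the true value of $\mathbf{a}$) in the AR case also.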
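The AR expression (33) is straightforward to evaluate numerically: filter the known signal with the inverse (whitening) filter, sum the squares of the last N - p outputs, and scale. The sketch below does this for two sinusoids in hypothetical first-order AR noise whose spectrum peaks at zero frequency. The coefficient, frequencies, amplitude and record length are arbitrary choices for illustration, and the Gaussian value of 1 is assumed for the product of the noise variance and $I_f$.

```python
import math


def ar_noncentrality(s, a, mu, sigma2, sigma2_if=1.0):
    """Evaluate (33): (mu^2/sigma^2) * sum_{n>p} (sum_j a_j s_{n-j})^2 * sigma^2 I_f."""
    coeffs = [1.0] + list(a)          # a_0 = 1, a_1, ..., a_p
    p = len(a)
    # Inverse-filter the signal, keeping only outputs that need no pre-window samples.
    filtered = [sum(c * s[n - j] for j, c in enumerate(coeffs))
                for n in range(p, len(s))]
    return (mu ** 2 / sigma2) * sum(v * v for v in filtered) * sigma2_if


N = 200
a = [-0.9]    # x_n = 0.9 x_{n-1} + u_n, so the noise PSD peaks at f = 0
low = [math.cos(2 * math.pi * 0.02 * n) for n in range(N)]    # near the PSD peak
high = [math.cos(2 * math.pi * 0.45 * n) for n in range(N)]   # near the PSD minimum

lam_low = ar_noncentrality(low, a, mu=0.5, sigma2=1.0)
lam_high = ar_noncentrality(high, a, mu=0.5, sigma2=1.0)
lam_white = ar_noncentrality([1.0, 1.0], [], mu=2.0, sigma2=1.0)  # p = 0: white noise
```

Both sinusoids have comparable energy, yet the one placed away from the noise spectral peak yields a much larger noncentrality parameter.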

Having established similar results in the cases of AR noise and the general linear
model, an attempt is now made to examine them. The AR noise model is chosen for
this purpose because of its intuitive frequency-domain interpretation. Figure 2 is a
block diagram representing (33). It shows that $\lambda$ can be obtained by inverse filtering
the signal and summing the squares of the output of the filter. If the signal has most
of its power at the frequency where the inverse filter has a zero, the output power and
hence $\lambda$ will be small, leading to a small probability of detection. In other words, it
is difficult to detect the signal if the peaks of the signal spectrum coincide with the
peaks of the noise PSD. This makes perfect intuitive sense. On the other hand it is
possible to maximize $\lambda$ by choosing a suitable signal $\mathbf{s}$ for a given noise background.
This can be done by constraining the signal energy to be constant and maximizing the
SNR at the output of a prewhitener-correlator over all possible signal shapes. Writing
the Toeplitz matrix $\mathbf{R}$ in terms of its orthonormal eigenvectors $\{\mathbf{v}_1, \mathbf{v}_2, \cdots, \mathbf{v}_N\}$ and
eigenvalues $\{\lambda_1, \lambda_2, \cdots, \lambda_N\}$

$$\mathbf{R} = \sum_{j=1}^{N} \lambda_j \mathbf{v}_j \mathbf{v}_j^T$$

it follows that

$$\mathbf{R}^{-1} = \sum_{j=1}^{N} \frac{1}{\lambda_j} \mathbf{v}_j \mathbf{v}_j^T \tag{35}$$

Hence

$$\mathbf{s}_0^T \mathbf{R}^{-1} \mathbf{s}_0 = \sum_{j=1}^{N} \frac{c_j^2}{\lambda_j} \tag{36}$$

where $c_j = \mathbf{s}_0^T\mathbf{v}_j$ is the component of the signal $\mathbf{s}_0$ along the eigenvector $\mathbf{v}_j$. The
condition of constant signal energy can be written as

$$\mathbf{s}_0^T \mathbf{s}_0 = P_s \tag{37}$$

Since the eigenvectors are orthonormal

$$\sum_{j=1}^{N} c_j^2 = \sum_{j=1}^{N} \mathbf{s}_0^T \mathbf{v}_j \mathbf{v}_j^T \mathbf{s}_0 = \mathbf{s}_0^T \left[ \sum_{j=1}^{N} \mathbf{v}_j \mathbf{v}_j^T \right] \mathbf{s}_0 = \mathbf{s}_0^T \mathbf{s}_0 = P_s \tag{38}$$

The SNR given by the weighted sum (36) has to be maximized subject to the constraint
that the unweighted sum of the squares is fixed at $P_s$ as in (38). In general, $\mathbf{R}$ will
be positive definite and all the eigenvalues will be positive. If there exists a minimum
eigenvalue $\lambda_k$, then the SNR is maximized by choosing $c_k = \sqrt{P_s}$ and $c_j = 0$ for
$j \neq k$, i.e., by choosing $\mathbf{s}_0$ to be proportional to $\mathbf{v}_k$. Since the probability of detection
given by (23) is a monotonic function of $\lambda$, which is proportional to the SNR at the
output of a prewhitener-correlator, the above choice of the signal shape for a given
signal energy also maximizes the probability of detection. If one of the eigenvalues $\lambda_k$
is zero, then it is possible to choose the signal in such a way that there is no component
of noise along the signal vector and therefore the SNR is infinite, giving rise to singular
detection. Therefore the probability of detection is maximized by choosing the signal
in the direction of the smallest noise component. This is the discrete time equivalent
of a well-known result for the continuous case [Van Trees 1968]. An interesting special
case occurs when $N \to \infty$ such that the eigenvectors become

$$\mathbf{v}_j \to \frac{1}{\sqrt{N}} \left[ 1\ \ e^{j2\pi f_j}\ \ \cdots\ \ e^{j2\pi f_j (N-1)} \right]^T$$

Hence the optimum signal is a sinusoid in the direction of the eigenvector associated
with the minimum eigenvalue. For very large data records the eigenvalue $\lambda_j$ corresponding
to the eigenvector $\mathbf{v}_j$ approaches the value of the PSD at the frequency $f_j$. Hence the
signal easiest to detect would be a sinusoid at the frequency at which the noise PSD
has a minimum. This is also apparent from the frequency domain equivalent of (36)
(using Parseval's theorem)

$$\mathbf{s}_0^T \mathbf{R}^{-1} \mathbf{s}_0 = \int_{-0.5}^{0.5} \frac{|S_0(f)|^2}{P_{uu}(f)}\, df$$

where $S_0(f)$ is the signal spectrum (Fourier transform of $\mathbf{s}_0$) and $P_{uu}$ is the noise PSD.
With the constraint that

$$\int_{-0.5}^{0.5} |S_0(f)|^2\, df = 1$$

which is equivalent to (37), the integral is maximized if the numerator is large only
where the denominator is small or zero. This result has a nice intuitive justification.
However, if the filter parameters are completely unknown the above result cannot be
used to select a suitable signal.
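The constrained maximization of the weighted sum (36) can be made concrete with a short sketch. For a fixed total energy, it compares a signal whose energy is spread evenly over the eigenvector directions with one concentrated along the eigenvector of the smallest eigenvalue. The eigenvalues and energy below are hypothetical numbers, and the function names are illustrative.

```python
def output_snr(c, eigenvalues):
    """Weighted sum (36): sum_j c_j^2 / lambda_j."""
    return sum(cj * cj / lj for cj, lj in zip(c, eigenvalues))


def best_allocation(eigenvalues, energy):
    """Put all signal energy along the eigenvector with the smallest eigenvalue."""
    k = min(range(len(eigenvalues)), key=lambda j: eigenvalues[j])
    c = [0.0] * len(eigenvalues)
    c[k] = energy ** 0.5
    return c


eigs = [4.0, 1.0, 0.25]      # hypothetical noise eigenvalues
P_s = 2.0                    # fixed signal energy, as in (37)

c_best = best_allocation(eigs, P_s)
c_flat = [(P_s / len(eigs)) ** 0.5] * len(eigs)   # same energy, spread evenly

snr_best = output_snr(c_best, eigs)
snr_flat = output_snr(c_flat, eigs)
```

Both allocations satisfy the energy constraint (38), but the concentrated one attains the energy divided by the smallest eigenvalue, which is the largest value the weighted sum can take.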

The next issue of interest is the effect of the noise PDF on the detection performance.
A reasonable basis of comparison should be formed for this purpose. Therefore
the Gaussian and non-Gaussian noise processes are assumed to have the same PSD,
i.e., the same spectral shape and power, and detection of the same signal is considered.
Consequently, the comparison is done on the basis of a fixed signal to noise ratio (or
$\mathbf{s}_0^T\mathbf{R}^{-1}\mathbf{s}_0$). Under these assumptions, the probability of detection is larger for that noise
PDF which has a larger value of $\sigma^2 I_f$. In other words, given two noise backgrounds
with the same PSD but different underlying noise PDFs, in order to achieve the same
probability of detection, more SNR is required for that background for which $\sigma^2 I_f$ is
smaller. It is known that among all symmetric and integrable PDFs, the Gaussian
PDF is the only one for which $\sigma^2 I_f$ attains its minimum value of unity [Sengupta and
Kay 1986]. Therefore for a given noise PSD, it is easier to detect a signal known except
for amplitude in non-Gaussian noise than in Gaussian noise. From (34) it follows that
in order to have the same noncentrality parameters in the non-Gaussian and Gaussian
cases

$$SNR_{NG}\, (\sigma^2 I_f) = SNR_G$$

where $SNR_{NG}$ and $SNR_G$ are the SNRs required in non-Gaussian and Gaussian
noises, respectively, in order to achieve a given probability of detection (i.e., a given
$\lambda$). The above equation can also be written as

$$10\log_{10} SNR_G - 10\log_{10} SNR_{NG} = 10\log_{10}(\sigma^2 I_f) \tag{39}$$

Therefore $10\log_{10}(\sigma^2 I_f)$ is a measure of the SNR bonus in dB for a non-Gaussian
distribution. The result also holds for the special case of white noise when (33) becomes

$$\lambda = \frac{\mu^2}{\sigma^2} \sum_{n=1}^{N} s_n^2\ \sigma^2 I_f(\Phi) \tag{40}$$

The quantity $\sigma^2 I_f$ is now shown to be independent of scaling. If the random
variable $u$ has a PDF $f(u)$ then the normalized random variable $\bar{u} = u/\sigma$ has the PDF

$$g(\bar{u}) = \sigma f(\sigma\bar{u})$$

hence

$$\begin{aligned}
\sigma^2 I_f &= \sigma^2 \int_{-\infty}^{\infty} \frac{[f'(u)]^2}{f(u)}\, du \\
&= \sigma^2 \int_{-\infty}^{\infty} \frac{\left[\frac{1}{\sigma^2}\, g'\!\left(\frac{u}{\sigma}\right)\right]^2}{\frac{1}{\sigma}\, g\!\left(\frac{u}{\sigma}\right)}\, du \\
&= \int_{-\infty}^{\infty} \frac{[g'(\bar{u})]^2}{g(\bar{u})}\, d\bar{u}
\end{aligned}$$

Hence $\sigma^2 I_f$ depends only on the shape of the PDF and is unaffected by scaling. Therefore
the SNR bonus quantified by (39) is the same for any value of the noise power as
long as the powers of the non-Gaussian and Gaussian processes are the same.
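The scale invariance can be checked numerically. The sketch below integrates the variance and the Fisher information for location over the positive half-line, using the symmetry of the PDF, and multiplies the two. A Laplacian PDF is used purely as an illustrative assumption (it is not one of the models treated in the paper). For it the product equals 2 for every scale parameter, while the Gaussian gives 1. The function name and integration settings are hypothetical.

```python
import math


def sigma2_times_fisher(pdf, score, half_width, steps=20000):
    """Numerically evaluate sigma^2 * I_f for a symmetric zero-mean PDF.

    pdf(u) and score(u) = f'(u)/f(u) are only evaluated for u > 0.
    """
    h = half_width / steps
    var = info = 0.0
    for k in range(steps):
        u = (k + 0.5) * h              # midpoint rule on (0, half_width)
        w = 2.0 * pdf(u) * h           # factor 2 uses the symmetry f(-u) = f(u)
        var += u * u * w               # accumulates the variance
        info += score(u) ** 2 * w      # accumulates E[(f'/f)^2]
    return var * info


def laplacian_product(b):
    """sigma^2 * I_f for a Laplacian PDF with scale parameter b."""
    return sigma2_times_fisher(lambda u: math.exp(-u / b) / (2.0 * b),
                               lambda u: -1.0 / b,
                               half_width=40.0 * b)


product_b1 = laplacian_product(1.0)
product_b7 = laplacian_product(7.0)
product_gauss = sigma2_times_fisher(
    lambda u: math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi),
    lambda u: -u,
    half_width=10.0)
```

Through (39), a product of 2 corresponds to an SNR bonus of about 3 dB for the Laplacian example, regardless of the noise power.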

It is interesting to note that the same quantity represents the amount of departure
from Gaussianity for the problem of estimating AR filter parameters of a non-Gaussian
AR process. The CR bounds for these parameters are found to be less in the case of a
non-Gaussian PDF than the corresponding bounds in the Gaussian case by a factor of
$\sigma^2 I_f$ [Sengupta and Kay 1986].

As  an  illustration  of  the  improvement  made  by  the  proposed  detector  over  the 
Gaussian  detector,  consider  the  mixed- Gaussian  noise  PDF 


/(u)  = 


1  —  e 


+ 


re  ^ 


yj2TT(J2B  \J2/ko\ 

The first term on the right hand side is referred to as the background component with
variance $\sigma_B^2$ and the second term is called the interference component with variance
$\sigma_I^2$. $\epsilon$ is called the mixture parameter and is regarded as a measure of the degree of
contamination of the background Gaussian process by the interference process. The
model is useful in representing a nominally Gaussian noise background characterized by
the presence of sharp spikes or impulses [Sengupta and Kay 1986]. Assuming $\sigma_B^2 = 1$
and $\sigma_I^2 = 1000$, Figure 3 plots the SNR bonus given by (39) vs. $\epsilon$ (in this case $\Phi = \epsilon$). It
shows how much improvement can be expected over the Gaussian case in terms of SNR
while detecting a signal known except for amplitude in colored noise. The comparison
is made, as indicated before, on the basis of the same PSD in the Gaussian and
mixed-Gaussian cases. It should be mentioned, however, that introduction of impulses in an
otherwise Gaussian environment does not improve the probability of detection, as
is expected intuitively. This is because the introduction of impulses also
increases the noise power by a considerable amount. This increase in noise power is
alleviated by employing a non-Gaussian detector. As an example, for $\epsilon = 0.1$, the overall
noise variance is approximately $100\sigma_B^2$ (as compared to $\sigma_B^2$ before the introduction of
impulses), i.e., the noise power increases by 20 dB. It can be observed from Figure 3 that
the SNR bonus is also approximately 20 dB for $\epsilon = 0.1$. Therefore the mixed-Gaussian
detector does not suffer from a loss of performance, unlike the Gaussian detector, whose
threshold of detection is expected to go down considerably with the introduction of
impulses.
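
The bonus in (39) can be evaluated numerically for the mixed-Gaussian PDF. The following is a minimal sketch, not the procedure used to produce Figure 3. It assumes $I_f = \int [f'(u)]^2 / f(u)\,du$ and an overall variance $\sigma^2 = (1-\epsilon)\sigma_B^2 + \epsilon\sigma_I^2$, and uses a plain trapezoidal rule. The function names, step size and integration range are hypothetical choices made for this example.

```python
import math

def mixture_pdf_and_derivative(u, eps, var_b, var_i):
    """Return (f(u), f'(u)) for the zero-mean two-component Gaussian mixture."""
    background = (1.0 - eps) * math.exp(-u * u / (2.0 * var_b)) / math.sqrt(2.0 * math.pi * var_b)
    interference = eps * math.exp(-u * u / (2.0 * var_i)) / math.sqrt(2.0 * math.pi * var_i)
    f = background + interference
    df = -u * (background / var_b + interference / var_i)
    return f, df

def snr_bonus_db(eps, var_b=1.0, var_i=1000.0, step=0.005, span_sigmas=10.0):
    """Estimate 10*log10(sigma^2 * I_f) for the mixture by trapezoidal integration."""
    sigma2 = (1.0 - eps) * var_b + eps * var_i            # overall noise variance
    half_width = span_sigmas * math.sqrt(max(var_b, var_i))
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        u = -half_width + k * step
        f, df = mixture_pdf_and_derivative(u, eps, var_b, var_i)
        if f > 0.0:  # skip points where the PDF has underflowed to zero
            weight = 0.5 if k in (0, n) else 1.0
            total += weight * df * df / f
    fisher = total * step                                  # estimate of I_f
    return 10.0 * math.log10(sigma2 * fisher)

bonus_gaussian = snr_bonus_db(0.0)   # pure Gaussian background
bonus_at_0p1 = snr_bonus_db(0.1)     # 10% interference component

print(f"eps = 0.0: bonus = {bonus_gaussian:.2f} dB")
print(f"eps = 0.1: bonus = {bonus_at_0p1:.2f} dB")
```

With $\sigma_B^2 = 1$ and $\sigma_I^2 = 1000$, the pure Gaussian case gives essentially 0 dB, and the value at $\epsilon = 0.1$ falls between 19 and 20 dB, in line with the approximately 20 dB quoted above.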

VI.  Summary 

The GLRT for the detection of a signal known except for amplitude in unknown
colored non-Gaussian noise was derived in Section III through parametric modeling of
the noise PDF and covariance matrix. The popular time series models for the noise, such as AR,
MA and ARMA, are asymptotically special cases of the proposed linear
model for large data records. The GLRT was found to achieve the performance of a
clairvoyant GLRT asymptotically, i.e., knowledge of the nuisance parameters is not
required to attain an upper bound in performance. The effects of the signal spectrum
and the noise PSD on the detection performance were discussed. It was observed that
it is difficult to detect a signal whose spectrum matches the noise PSD. If, however,
most of the signal is along a direction of low noise component, it is very easy to detect.
The asymptotic performances of the GLRT for Gaussian and non-Gaussian noise models
were compared. It was concluded that detection in non-Gaussian noise is easier than
detection in Gaussian noise for the same noise PSD. The improvement in performance
of the GLRT in a non-Gaussian noise background over the Gaussian case is easily
quantified in terms of the SNR ‘bonus’ as a function of the noise PDF parameters.

In order to implement the GLRT described in Section III one needs to find the
MLEs of the unknown parameters under each hypothesis. Some work along this line
has been done for the case of AR noise [Sengupta and Kay 1986, 2], using reasonable
approximations to reduce computation. This work is appropriate for estimation under
the null hypothesis ($\mathcal{H}_0$), and extension to the case of the alternative hypothesis ($\mathcal{H}_1$) is not
straightforward except for the special case of a d.c. signal ($s_j = 1$, $j = 1, 2, \ldots, N$).
Evaluating the joint MLE of the mean or location parameter ($\mu$) and the AR filter
parameters may be particularly difficult for most non-Gaussian processes.
Computationally efficient approximations to the GLRT, such as the Rao test and the Wald test
[Rao 1973], can be used for this purpose. Estimation of the mean and the other
parameters under $\mathcal{H}_1$ can thus be avoided for small signal amplitudes [Sengupta 1986]. This
problem will be addressed in a future paper.

References 

[1]  H.L.  Van  Trees,  Detection,  Estimation,  and  Modulation  Theory,  Chapter  4, 
New  York:  John  Wiley,  1968. 

[2]  A.D.  Whalen,  Detection  of  Signals  in  Noise,  Chapter  9,  New  York:  Academic, 
1971. 


[3] D.E. Bowyer et al, “Adaptive Clutter Filtering using Autoregressive Spectral
Estimation”, IEEE Trans. on Aerosp. Electron. Syst., pp. 538-546, July 1979.

[4] S.M. Kay, “Asymptotically Optimal Detection in Unknown Colored Noise via
Autoregressive Modeling”, IEEE Trans. on Acoustics, Speech and Signal Processing,
pp. 927-940, Vol. ASSP-31, Aug. 1983.

[5]  W.C.  Knight,  R.G.  Pridham  and  S.M.  Kay,  “Digital  Signal  Processing  for 
Sonar”,  Proc.  of  the  IEEE,  pp.  1451-1506,  Nov.  1981. 

[6] S.C. Lee, L.W. Nolte and C.P. Hatsell, “A Generalized Likelihood Ratio
Formula: Arbitrary Noise Statistics for Doubly Composite Hypotheses”, IEEE Trans. on
Info. Theory, pp. 637-639, Vol. IT-23, Sept. 1977.

[7] S.A. Kassam and H.V. Poor, “Robust Techniques for Signal Processing”, Proc.
of the IEEE, pp. 433-481, Vol. 73, Mar. 1985.

[8]  A.B.  Martinez  and  J.B.  Thomas,  “Non-Gaussian  Multivariate  Noise  Models  for 
Signal  Detection”,  ONR  report  #6,  Sept.  1982. 

[9]  S.V.  Czarnecki  and  J.B.  Thomas,  “Nearly  Optimal  Detection  of  Signals  in  Non- 
Gaussian  Noise”,  ONR  report  #14,  Feb.  1984. 

[10] S.M. Kay, “Asymptotically Optimal Detection in Incompletely Characterized
Non-Gaussian Noise”, submitted to IEEE Trans. on Acoustics, Speech and Signal
Processing, 1985.

[11] Sir M. Kendall and A. Stuart, The Advanced Theory of Statistics Vol. II,
Chapters 18-19, New York: MacMillan Publishing, 1979.

[12] E.L. Lehmann, Testing Statistical Hypotheses, New York: John Wiley, 1959.

[13] D.R. Cox and D.V. Hinkley, Theoretical Statistics, Chapter 4, London:
Chapman and Hall, 1974.

[14] G.E.P. Box and G.J. Jenkins, Time Series Analysis: Forecasting and Control,
Chapter 7, San Francisco: Holden-Day, 1970.

[15] S.M. Kay, “More Accurate Autoregressive Parameter and Spectral Estimates
for Short Data Records”, presented at the ASSP workshop on Spectral Estimation,
Hamilton, Ontario, Canada, Aug. 17-18, 1981.

[16] D. Sengupta, “Estimation and Detection for Non-Gaussian Processes using
Autoregressive and Other Models”, M.S. Thesis, Dept. of Electrical Engineering, Univ.
of Rhode Island, 1986.

[17] P.J. Bickel and K.A. Doksum, Mathematical Statistics: Basic Ideas and
Selected Topics, Chapter 4, San Francisco: Holden-Day, 1977.

[18] D. Sengupta and S.M. Kay, “Efficient Estimation of Parameters for
Non-Gaussian Autoregressive Processes”, submitted for review to IEEE Trans. on Acoustics,
Speech and Signal Processing.

[19] S.M. Kay and D. Sengupta, “Simple and Efficient Estimation of Parameters
of Non-Gaussian Autoregressive Processes”, submitted for review to IEEE Trans. on
Acoustics, Speech and Signal Processing.

[20] C.R. Rao, Linear Statistical Inference and its Applications, Chapter 6, New
York: John Wiley, 1973.


Appendix  A 

Asymptotic  Optimality  of  the  GLRT 
for  a  General  Linear  Model  of  the  Noise 


Assuming that $f$ is an even PDF and $\theta_s$ is as given in (10), it will now be
shown that (26) holds for the detection problem defined in (7). It suffices to prove that

To prove (A.1) it is observed that

$\omega_{ik}$ is written without its argument (see (12)) to simplify the notation.

$u_n$ can be used as the argument of $f$ (see (28)) to simplify the equation.

All the cross-terms are zero because the $u_n$'s are i.i.d. and

(A.5)


under certain regularity assumptions on $f$ [Bickel and Doksum 1977]. The derivatives
w.r.t. $\mu$ and $\omega_{ik}$ can be written in terms of the derivative w.r.t. $u_n$. Note from (28)

From (A.6) and (29) it follows that (A.4) can be rewritten as

$(y_k - \mu s_k)$ is a linear function of $u$, as observed from (7). The PDF $f$ is even and the
expectation is taken of a function which is odd in each $u_n$. Therefore the expectation
must be zero.

This is true for each $i$ and $k$, so that

(A.1) follows directly from (A.3) and (A.7).

(A.2) can be proved in a similar way. Consider

Using (A.6) this becomes

Under the assumption that $f$ is even, $\ln f$ is even, the derivative of $\ln f$ w.r.t. $u_n$ is odd, and
the derivative of $\ln f$ w.r.t. $\phi_i$ is even [Kay 1985]. Therefore the expectation is taken
of an odd function and must be equal to zero, as explained in proving (A.7). This
being true for each $\phi_i$, one can conclude that (A.2) holds. (26) is a direct implication
of (A.1) and (A.2).
Appendix B


Asymptotic  Optimality  of  the  GLRT 
for  an  AR  Model  of  the  Noise 


It is now shown that (26) also holds in the case of the GLRT given by (20) for
detection in AR noise. Note that the conditional likelihood function (see (19)) is used
and the vector of nuisance parameters is

$\theta_s = [a \;\; \Phi]$

with the notations used before. Proving (26) in this case is equivalent to proving that

(16) can be used to simplify the argument of $f$,

The last step follows from (A.5) and the fact that the $u_n$'s are i.i.d. The derivatives w.r.t.
$\mu$ and $a_i$ can be written in terms of the derivative w.r.t. $u_n$. From (16b) it follows that

Using these results (B.3) can be rewritten as

$(y_{n-i} - \mu s_{n-i})$ is a linear function of $\{u_{n-i}, u_{n-i-1}, \ldots, u_1\}$ and hence is an odd
function of each $u_n$. Therefore the expectation is taken of an odd function, which should
be equal to zero since the PDF $f$ itself is even. This being true for each $a_i$, it can be
concluded that (B.1) holds.

Proof of (B.2) is similar. Consider

(using (B.4))

Since $f$ is even, the derivative of $\ln f$ w.r.t. $u_n$ is odd and that w.r.t. $\phi_i$ is even. The
expectation is therefore taken of an odd function and must equal zero. Since this is
true for each $\phi_i$, it can be concluded that (B.2) holds. Consequently, (26) holds for the
case of AR noise when the GLRT is computed on the basis of the conditional likelihood
function as in (20).


[Figure: detector block diagram. Recoverable labels: "Estimate under $\mathcal{H}_0$", "prewhitener $A(z)$", "$\sum_n 2\ln f(u_n)$", "+", "$= 2\ln L$"]

[Figure caption fragment: "... for a Mixed-Gaussian process"]